Graduated from Carnegie Mellon University’s top-ranked Statistics and Economics programs with University and College Honors, while conducting research that culminated in two senior thesis projects. Concurrently, I completed internships and gained industry experience tailored to a career in data analytics, where I made substantial contributions and worked in depth across the entire data life cycle.
This plot shows a border precinct matching algorithm for Allegheny County, Pennsylvania. The algorithm is generalized to run on any county.
This scatterplot shows a Principal Component Analysis of Supreme Court (SCOTUS) cases. Moving from the 2022 Term to the 2023 Term, the horizontal axis becomes much more important and the Court shifts from three clusters to two.
This graph pairs Supreme Court (SCOTUS) justices to show how often they voted together for the 2022 and 2023 Terms.
This map shows all wind farms in the contiguous United States along with their custom 3-dimensional turbine density metrics, evaluating turbine wind interference.
This plot shows the decrease in debt service payments as a percentage of disposable personal income from 2001 to 2020, as a function of unemployment.
Statistical Graphics Visualization Team Competition 1st place award: 20 statistics professionals judging 26 teams.
Economics and Data Science Individual Visualization Challenge awarded 1st place of 67 entries: score of 98/100.
## -- Attaching packages ---------------------------------------------- fpp2 2.5 --
## v ggplot2 3.5.1 v fma 2.5
## v forecast 8.22.0 v expsmooth 2.3
##
## Call:
## tslm(formula = Consumption ~ Income, data = uschange)
##
## Coefficients:
## (Intercept)       Income
##      0.5451       0.2806
##
## Call:
## tslm(formula = Consumption ~ Income + Production + Unemployment +
## Savings, data = uschange)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.88296 -0.17638 -0.03679  0.15251  1.20553
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)   0.26729    0.03721   7.184 1.68e-11 ***
## Income        0.71449    0.04219  16.934  < 2e-16 ***
## Production    0.04589    0.02588   1.773   0.0778 .
## Unemployment -0.20477    0.10550  -1.941   0.0538 .
## Savings      -0.04527    0.00278 -16.287  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3286 on 182 degrees of freedom
## Multiple R-squared: 0.754, Adjusted R-squared: 0.7486
## F-statistic: 139.5 on 4 and 182 DF, p-value: < 2.2e-16
##
## Call:
## tslm(formula = aussies ~ guinearice)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5.9448 -1.8917 -0.3272  1.8620 10.4210
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   -7.493      1.203  -6.229 2.25e-07 ***
## guinearice    40.288      1.337  30.135  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.239 on 40 degrees of freedom
## Multiple R-squared: 0.9578, Adjusted R-squared: 0.9568
## F-statistic: 908.1 on 1 and 40 DF, p-value: < 2.2e-16
##
## Call:
## tslm(formula = beer2 ~ trend + fourier(beer2, K = 2))
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -42.903  -7.599  -0.459   7.991  21.789
##
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)
## (Intercept)               446.87920    2.87321 155.533  < 2e-16 ***
## trend                      -0.34027    0.06657  -5.111 2.73e-06 ***
## fourier(beer2, K = 2)S1-4   8.91082    2.01125   4.430 3.45e-05 ***
## fourier(beer2, K = 2)C1-4  53.72807    2.01125  26.714  < 2e-16 ***
## fourier(beer2, K = 2)C2-4  13.98958    1.42256   9.834 9.26e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.23 on 69 degrees of freedom
## Multiple R-squared: 0.9243, Adjusted R-squared: 0.9199
## F-statistic: 210.7 on 4 and 69 DF, p-value: < 2.2e-16
The first argument to fourier() allows it to identify the seasonal period $m$ and the length of the predictors to return. The second argument, K, specifies how many pairs of sine and cosine terms to include. The maximum allowed is $K = m/2$, where $m$ is the seasonal period. Because we have used the maximum here, the results are identical to those obtained when using seasonal dummy variables.
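The equivalence with seasonal dummy variables can be checked with a small base-R sketch. It does not use the beer2 series or fourier() itself: the data are simulated, the Fourier columns are built by hand, and every name (`time_idx`, `quarter`, `S1`, `C1`, `C2`, the coefficients used to generate `y`) is an assumption made for illustration. For quarterly data ($m = 4$, $K = 2$), the second sine term is $\sin(\pi t) = 0$ at every integer $t$, which is presumably why only three Fourier columns (S1-4, C1-4, C2-4) appear in the output above.

```r
# Illustrative sketch with SIMULATED quarterly data (not the beer2 series).
set.seed(42)                       # arbitrary seed, for reproducibility only
m <- 4                             # seasonal period (quarterly)
n <- 40                            # ten simulated years
time_idx <- seq_len(n)             # linear trend regressor
quarter  <- factor((time_idx - 1) %% m + 1)

# Made-up trend, seasonal pattern, and noise: purely hypothetical values.
seasonal_effect <- c(10, -20, 5, 40)[as.integer(quarter)]
y <- 450 - 0.3 * time_idx + seasonal_effect + rnorm(n, sd = 5)

# Hand-built Fourier terms for K = m/2 = 2.
# The k = 2 sine term, sin(2 * pi * 2 * time_idx / m) = sin(pi * time_idx),
# is zero at every integer time point, so it is omitted.
S1 <- sin(2 * pi * 1 * time_idx / m)
C1 <- cos(2 * pi * 1 * time_idx / m)
C2 <- cos(2 * pi * 2 * time_idx / m)

# Same trend, two encodings of seasonality.
fit_fourier <- lm(y ~ time_idx + S1 + C1 + C2)
fit_dummy   <- lm(y ~ time_idx + quarter)

# Compare fitted values up to floating-point tolerance.
same_fit <- isTRUE(all.equal(unname(fitted(fit_fourier)),
                             unname(fitted(fit_dummy))))
print(same_fit)
```

Both models have five coefficients and span the same set of period-4 patterns, so `same_fit` should come out `TRUE`. With a smaller K the Fourier model would have fewer seasonal terms and the fits would generally differ.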
##           CV          AIC         AICc          BIC        AdjR2
##    0.1163477 -409.2980298 -408.8313631 -389.9113781    0.7485856